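In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to begin as mobile phone photos uploaded by non-professionals. Computer vision for document automation must therefore account for documents captured in natural scene contexts. An additional challenge is that task objectives for document processing can be highly use-case specific, which limits the utility of public datasets, while manual data labeling is costly and transfers poorly between use cases. To address these issues, we created Sim2Real Docs, a framework for synthesizing datasets and performing domain randomization of documents in natural scenes. Sim2Real Docs enables programmatic 3D rendering of documents using Blender, an open-source tool for 3D modeling and ray-traced rendering. By using rendering to simulate the physical interactions of light, geometry, camera, and background, we synthesize datasets of documents in natural scene contexts. Each render is paired with use-case-specific ground truth data specifying the latent characteristics of interest, producing unlimited fit-for-purpose training data. The role of the machine learning model is then to solve the inverse problem posed by the rendering pipeline. Such models can be iterated on further by fine-tuning or by adjusting the domain randomization parameters.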
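The pairing of randomized scene parameters with ground truth can be illustrated with a minimal sketch. It does not call Blender and is not the authors' pipeline: the parameter names, their ranges, the simplified downward-looking pinhole camera, and the choice of page-corner pixel coordinates as the ground truth are all assumptions made for illustration. In a real pipeline, each sampled `SceneParams` would drive a renderer, and the label would be whatever latent characteristic the use case requires.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical randomized scene parameters (names and units are assumptions)."""
    light_power_w: float
    camera_height_m: float
    doc_rotation_deg: float
    doc_offset_m: tuple
    background_id: int


def sample_scene(rng, n_backgrounds=10):
    """Draw one scene from illustrative, hand-picked randomization ranges."""
    return SceneParams(
        light_power_w=rng.uniform(100.0, 1000.0),
        camera_height_m=rng.uniform(0.3, 0.8),
        doc_rotation_deg=rng.uniform(-45.0, 45.0),
        doc_offset_m=(rng.uniform(-0.05, 0.05), rng.uniform(-0.05, 0.05)),
        background_id=rng.randrange(n_backgrounds),
    )


def corner_ground_truth(params, focal_px=1000.0, image_wh=(1080, 1920),
                        page_wh_m=(0.210, 0.297)):
    """Project the four page corners with a pinhole camera looking straight down.

    The page lies flat on the plane z = 0, rotated in-plane and shifted by the
    sampled offset; the camera sits params.camera_height_m above the origin.
    """
    half_w, half_h = page_wh_m[0] / 2, page_wh_m[1] / 2
    theta = math.radians(params.doc_rotation_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = image_wh[0] / 2, image_wh[1] / 2
    corners = []
    for x, y in [(-half_w, -half_h), (half_w, -half_h),
                 (half_w, half_h), (-half_w, half_h)]:
        # Rotate in the page plane, then translate.
        xr = cos_t * x - sin_t * y + params.doc_offset_m[0]
        yr = sin_t * x + cos_t * y + params.doc_offset_m[1]
        # Perspective projection at a constant depth equal to the camera height.
        corners.append((cx + focal_px * xr / params.camera_height_m,
                        cy + focal_px * yr / params.camera_height_m))
    return corners


def generate_dataset(n, seed=0):
    """Return n (scene parameters, ground truth) pairs; the render step is omitted."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        params = sample_scene(rng)
        samples.append((params, corner_ground_truth(params)))
    return samples
```

Because the labels are computed from the same parameters that would drive the render, they are exact by construction, which is what lets a model be trained to invert the rendering process.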
Embedding-based product recommendations have gained popularity in recent years because they integrate easily into large-scale systems and allow nearest-neighbor searches in real time. The bulk of studies in this area has focused on similar-item recommendations. Research on complementary-item recommendations, on the other hand, remains considerably under-explored. We define similar items as items that are interchangeable in terms of their utility, and complementary items as items that serve different purposes yet are compatible when used with one another. In this paper, we apply a novel approach to finding complementary items by leveraging dual embedding representations for products. We demonstrate that the notion of relatedness discovered in NLP for skip-gram negative sampling (SGNS) models translates effectively to the concept of complementarity when item representations are trained on co-purchase data. Since sparsity of purchase data is a major challenge in real-world scenarios, we further augment the model with synthetic samples to extend coverage. This allows the model to provide complementary recommendations for items that do not share co-purchase data by leveraging other abundantly available data modalities such as images, text, and clicks. We establish the effectiveness of our approach in improving both the coverage and the quality of recommendations on real-world data for a major online retail company. We further show the importance of task-specific hyperparameter tuning in training SGNS. Our model is effective yet simple to implement, making it a strong candidate for generating complementary-item recommendations on any e-commerce website.
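As a rough illustration of the dual-embedding idea, the sketch below trains a tiny SGNS model on toy co-purchase pairs and keeps both embedding tables. Scoring a candidate by the dot product of the query's input vector with the candidate's output vector is an assumption about how the two representations are combined, not a detail confirmed by the abstract. The item IDs, hyperparameters, and uniform negative sampling over items never co-purchased with the query are likewise simplifications chosen for the example, and the synthetic-sample augmentation is not shown.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_sgns(pairs, n_items, dim=8, epochs=300, lr=0.1, n_neg=2, seed=0):
    """Train input and output embeddings on (item, co-purchased item) pairs."""
    rng = random.Random(seed)
    w_in = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    w_out = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    bought_with = {i: set() for i in range(n_items)}
    for a, b in pairs:
        bought_with[a].add(b)
    for _ in range(epochs):
        for a, b in pairs:
            # Simplification: negatives are drawn uniformly from items that
            # were never co-purchased with item a.
            pool = [j for j in range(n_items)
                    if j != a and j not in bought_with[a]]
            targets = [(b, 1.0)] + [(rng.choice(pool), 0.0) for _ in range(n_neg)]
            for j, label in targets:
                # Gradient of the logistic loss with respect to the score.
                grad = 1.0 / (1.0 + math.exp(-dot(w_in[a], w_out[j]))) - label
                old_in = list(w_in[a])
                for d in range(dim):
                    w_in[a][d] -= lr * grad * w_out[j][d]
                    w_out[j][d] -= lr * grad * old_in[d]
    return w_in, w_out


def complementary(query, w_in, w_out, k=2):
    """Rank candidates by input(query) . output(candidate), excluding the query."""
    scores = [(dot(w_in[query], w_out[j]), j)
              for j in range(len(w_out)) if j != query]
    return [j for _, j in sorted(scores, reverse=True)[:k]]


# Toy catalog: 0=phone, 1=case, 2=charger, 3=laptop, 4=mouse, 5=bag.
pairs = [(0, 1), (1, 0), (0, 2), (2, 0), (3, 4), (4, 3), (3, 5), (5, 3)]
w_in, w_out = train_sgns(pairs, n_items=6)
print(complementary(0, w_in, w_out, k=2))
```

Comparing two input vectors with each other would instead tend to group items that occur in similar purchase contexts, which is closer to the similar-item notion the abstract contrasts with complementarity.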